Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm
Abstract
This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames-per-second (FPS). Quantisation identifies a sweet spot of 3 bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2 and 3 bit quantised neural networks increases throughput from 6 k FPS to 373 k FPS, a 62× speedup.
Similar resources
Data-Free Knowledge Distillation for Deep Neural Networks
Recent advances in model compression have provided procedures for compressing large neural networks to a fraction of their original size while retaining most if not all of their accuracy. However, all of these approaches rely on access to the original training set, which might not always be possible if the network to be compressed was trained on a very large dataset, or on a dataset whose relea...
Optimising Quantisation Noise in Energy Measurement
We give a model of parallel distributed genetic improvement. With modern low cost power monitors, high speed Ethernet LAN latency and network jitter have little effect. The model calculates a minimum usable mutation effect based on the analogue to digital converter (ADC)’s resolution and shows the optimal test duration is inversely proportional to the smallest impact we wish to detect. Using the ex...
Hybrid Evolutionary Algorithm with Product-Unit Neural Networks for Classification
In this paper we propose a classification method based on a special class of feed-forward neural network, namely product-unit neural networks, and on a dynamic version of a hybrid evolutionary neural network algorithm. The method combines an evolutionary algorithm, a clustering process, and a local search procedure, where the clustering process and the local search are only applied at specific ...
Hardware Accelerated Apriori Algorithm for Data Mining
The Apriori algorithm is a popular correlation-based data-mining kernel. However, it is a computationally expensive algorithm, and running times can stretch to days for large databases, as database sizes can extend to gigabytes. Through the use of a new extension to the systolic array architecture, the time required for processing can be significantly reduced. Our array architecture implementa...
A Hierarchy Topology Design Using a Hybrid Evolutionary Algorithm in Wireless Sensor Networks
A wireless sensor network is a powerful network that contains many wireless sensors with limited power resources, data processing, and transmission abilities. Wireless sensor capabilities, including computational capacity, radio power, and memory, are very limited. Moreover, to design a hierarchy topology, in addition to energy optimization, find an optimum clusters number and best location of ...
Journal
Journal title: Electronics
Year: 2021
ISSN: 2079-9292
DOI: https://doi.org/10.3390/electronics10040396